High Performance Cache Architectures to Support Dynamic Superscalar Microprocessors

نویسندگان

Kenneth M. Wilson

Kunle Olukotun

چکیده

Simple cache structures are not sufficient to provide the memory bandwidth needed by a dynamic superscalar computer, so more sophisticated memory hierarchies such as non-blocking and pipelined caches are required. To provide direction for the designers of modern high performance microprocessors, we investigate the performance tradeoffs of the combinations of cache size, blocking and non-blocking caches, and pipeline depth of caches within the memory subsystem of a dynamic superscalar processor for integer applications. The results show that the dynamic superscalar processor can hide about two-thirds of the additional latency of two and three pipelined caches, and that a non-blocking cache is always beneficial. A pipelined cache will only outperform a non-pipelined cache if the miss penalty and miss rates are large.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors Kenneth

The memory bandwidth demands of modern microprocessors require the use of a multi-ported cache to achieve peak performance. However, multi-ported caches are costly to implement. In this paper we propose techniques for improving the bandwidth of a single cache port by using additional buffering in the processor, and by taking maximum advantage of a wider cache port. We evaluate these techniques ...

متن کامل

Multiple Branch Prediction for Wide - Issue Superscalar ∗

Modern micro-architectures employ superscalar techniques to enhance system performance. Since the superscalar microprocessors must fetch at least one instruction cache line at a time to support high issue rate and large amount speculative executions. There are cases that multiple branches are often encountered in one cycle. And in practical implementation this would cause serious problem while ...

متن کامل

Process Prefetching for a Simultaneous Multithreaded Architecture

Traditional superscalar architectures shall eventually prove incapable of taking full advantage of billions of transistors to be available in the future generations of microprocessors if they remain limited by dataflow dependencies. Thus, SMT (Simultaneous Multithreaded) architecture may be a possible solution to this problem, as far as it can fetch and execute a great deal of instruction flows...

متن کامل

Evaluation of dynamic branch predictors for modern ILP processors

Modern instruction-level parallel (ILP) processors use superscalar architectures with deep pipelines in order to execute multiple instructions per cycle. The frequency and behavior of branch instructions seriously hinder performance of ILP processors. Various mechanisms, both at the compiler, as well as the processor level, have been proposed to predict the branch behavior. This work investigat...

متن کامل

A complexity-effective microprocessor design with decoupled dispatch queues and prefetching

Continuing demands for high degrees of Instruction Level Parallelism (ILP) require large dispatch queues (or centralized reservation stations) in modern superscalar microprocessors. However, such large dispatch queues are inevitably accompanied by high circuit complexity which would correspondingly limit the pipeline clock rates. In other words, increasing the size of the dispatch queue ultimat...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1998

High Performance Cache Architectures to Support Dynamic Superscalar Microprocessors

نویسندگان

چکیده

منابع مشابه

Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors Kenneth

Multiple Branch Prediction for Wide - Issue Superscalar ∗

Process Prefetching for a Simultaneous Multithreaded Architecture

Evaluation of dynamic branch predictors for modern ILP processors

A complexity-effective microprocessor design with decoupled dispatch queues and prefetching

عنوان ژورنال:

اشتراک گذاری